1.  Introduction

This report summarizes the research work that was performed at Stanford University
under ARPA Order No. 3423, Contract MDA903-80-C-107. The overall purpose of the
contract was to support basic and applied research into the science and engineering of
knowledge-based systems. This work had four major subcomponents:


1.  Basic Research in Knowledge Acquisition, Representation, Utilization, and
Evaluation

2.  Development of Tools to Facilitate the Process of Knowledge-Based Systems
Construction

3.  Propagation of the Tools and Techniques of this Discipline to Other Areas of DOD
Interest

4.  Initial Work on the Application of Knowledge Engineering Methodologies to
Problems in VLSI Design

All of the research in the first three subcomponents and a portion of the VLSI Design work
were performed directly within the Heuristic Programming Project (HPP), Principal
Investigators Professors Edward Feigenbaum and Bruce Buchanan. Associated with the VLSI
Design work was a separate project in VLSI Theory, Principal Investigator Professor Jeffrey
Ullman. Most of the research was performed using the facilities of the NIH-supported
SUMEX-AIM computer system (a Digital Equipment Corporation dual KI-10 processor system).
This contract did fund the acquisition of a DEC VAX 11/780 facility which supported some
of the basic research into heuristic methods as well as the VLSI Theory work.

The remainder of this report will discuss both the HPP research and the VLSI Theory
research, along with resulting publications.


2.  HPP  Research 



2.1.  Introduction 

The  HPP  is  a  group  of  professors,  research  scientists,  programmers,  and  students  within  the 
Stanford  University  Computer  Science  Department.  It  has  long-standing  research  goals  of 
applying  the  methodologies  of  applied  artificial  intelligence  to  difficult  problems  in  science, 
engineering,  and  medicine.  To  accomplish  these  goals,  HPP  work  ranges  from  basic  research  in 
the  fundamentals  of  knowledge  representation  and  acquisition,  to  specific  projects  to  produce 
functional problem-solving systems in areas as diverse as infectious disease treatment and design
of integrated circuits.

In  the  context  of  this  ARPA  contract,  HPP  work  was  funded  in  the  following  seven  areas: 


1.  Research on the Representation, Acquisition, and Use of Knowledge

2.  Generalization of Knowledge-Based System Design and Implementation Techniques
(The AGE Project)

3.  Research in Experiment Planning (The MOLGEN Project)

4.  Development of Knowledge-Based 3-D Signal Understanding Techniques (The
CRYSALIS Project)

5.  Knowledge-Based VLSI Design

6.  Creation of the Handbook of Artificial Intelligence

7.  Transfer of AI Technology to Other Areas of DOD Interest

For each of these areas, the work that resulted during the course of the contract will be
described  along  with  pointers  to  the  relevant  HPP  publications.  A  complete  bibliography  of 
HPP  publications  for  the  period  of  the  contract  is  given  at  the  end  of  this  section. 


2.2.  Research  on  the  Representation,  Acquisition,  and  Use  of  Knowledge 

In this period, our basic research on the representation, use, and acquisition of knowledge
focused on several large systems, described below. Numerous publications address these issues
in the context of these systems. These issues are also central to other work on
AGE, Experiment Planning, and VLSI Design discussed elsewhere. They have been, and
continue to be, central issues in all of artificial intelligence.

The  systems  in  which  this  research  was  accomplished  are  important  in  their  own  right.  One 
of  the  fundamental  conclusions  of  this  work  is  that  construction  of  a  large,  intelligent  system 
depends  on  the  separation  of  domain-specific  knowledge  from  the  "inference  engine"  that  uses 
that  knowledge.  The  systems  listed  below  are  demonstrations  of  the  power  of  this  conclusion. 
Publications  listed  with  them  discuss  other  conclusions  and  problems. 


2.2.1.  The  Family  of  EMYCIN  Systems  (EMYCIN,  ROGET,  PUFF) 

The  research  on  EMYCIN,  and  subsequent  dissemination  of  EMYCIN  to  many  academic, 
industrial,  and  military  sites,  demonstrates  the  power  of  using  a  simple,  rule-based 
representation  of  knowledge  and  a  simple,  backward-chaining  inference  mechanism.  It  was  the 
first  system  to  achieve  the  clear  separation  of  inference  procedures  from  knowledge  base,  and 
the  first  to  define  a  domain-independent  "shell"  into  which  knowledge  of  many  different 
domains  could  be  encoded.  Simply  put,  EMYCIN  resulted  from  taking  the  powerful  rule-based 
structure  of  MYCIN,  along  with  associated  explanation  and  knowledge  acquisition  facilities,  but 
removing  all  domain-specific  knowledge  (infectious  disease  diagnosis  and  therapy  for  MYCIN). 
The  EMYCIN  work  was  co-funded  by  NSF  Grant  MCS79-03753.  For  further  information,  see 
HPP-80-11,  HPP-80-22,  HPP-81-16,  and  HPP-81-27. 
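
As an illustration only, the separation of a domain-independent inference procedure from interchangeable knowledge bases can be sketched in a few lines of modern Python. This is not EMYCIN's implementation: certainty factors, explanation, and knowledge acquisition are omitted, and the rule contents, the names `backward_chain`, `LUNG_RULES`, and `CIRCUIT_RULES` are hypothetical, invented for the example.

```python
from typing import FrozenSet, List, Set, Tuple

# A rule is (premises, conclusion): if every premise holds, so does the conclusion.
Rule = Tuple[List[str], str]


def backward_chain(goal: str, rules: List[Rule], facts: Set[str],
                   pending: FrozenSet[str] = frozenset()) -> bool:
    """Return True if `goal` follows from `facts` under `rules`."""
    if goal in facts:
        return True
    if goal in pending:  # already trying to prove this goal: avoid looping
        return False
    pending = pending | {goal}
    for premises, conclusion in rules:
        # Work backward: a rule concluding the goal makes its premises subgoals.
        if conclusion == goal and all(
                backward_chain(p, rules, facts, pending) for p in premises):
            return True
    return False


# Two hypothetical knowledge bases; the inference procedure above is shared.
LUNG_RULES: List[Rule] = [
    (["reduced_airflow", "improves_after_bronchodilator"], "asthma_suspected"),
    (["low_fev1_fvc_ratio"], "reduced_airflow"),
]
CIRCUIT_RULES: List[Rule] = [
    (["no_output", "power_ok"], "gate_fault_suspected"),
]

print(backward_chain("asthma_suspected", LUNG_RULES,
                     {"low_fev1_fvc_ratio", "improves_after_bronchodilator"}))
```

Swapping `LUNG_RULES` for `CIRCUIT_RULES` changes the domain without touching `backward_chain`, which is the property the "shell" idea relies on.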

PUFF was one of the first demonstrations that the EMYCIN framework could be used in
domains other than the domain of the original MYCIN system. It showed the applicability of
EMYCIN to diagnostic problems generally. It also showed a mechanism for technology
transfer of AI ideas from the development environment to the environment of routine use.
PUFF  diagnosed  lung  diseases  of  various  types  and  was  successfully  tested  in  a  clinical  setting 
at  the  Pacific  Medical  Center  by  Dr.  Robert  Fallat.  It  was  co-funded  by  the  NIH  Institute  of 
Medical  Sciences,  Grant  2522-2-02.  For  further  information,  see  HPP-82-13. 

ROGET was an experimental system whose focus was knowledge engineering, the method of
acquiring  knowledge  interactively  through  a  dialogue  between  a  subject  expert  and  a  software 
specialist  (called  a  knowledge  engineer).  ROGET  encodes  knowledge  about  the  knowledge 
engineering  process  in  order  to  carry  on  a  dialogue  itself  with  a  subject  expert.  While  previous 
work  (e.g.,  TEIRESIAS)  had  focused  on  semi-automatic  refinement  of  a  knowledge  base  already 
designed  and  largely  written,  the  research  on  ROGET  focused  on  acquiring  the  initial 
conceptual structure needed to design the knowledge base in the first place. ROGET's special
expertise  was  in  the  construction  of  systems,  like  PUFF,  that  are  constructed  in  EMYCIN.  It 
demonstrated  that  some  parts  of  this  knowledge  engineering  process  can  be  automated,  and  also 
highlighted  many  parts  of  it  that  are  still  more  an  art  than  a  science.  For  further  information, 
see  HPP-83-24. 


2.2.2.  Variations  on  the  EMYCIN  Architecture  (CENTAUR,  VM)

Two  systems  were  developed  in  this  time  period  that  are  variations  on  the  simple 
representation  and  inference  mechanism  used  in  EMYCIN.  CENTAUR  addresses  the  problem 
of encoding strategy (or control) knowledge in a rule-based system and the issue of the
expressive  power  of  rule-based  representations.  It  uses  a  combination  of  rules  and  frames  to 
separate  (a)  the  knowledge  that  makes  inferences  from  data  from  (b)  the  knowledge  that 
controls  the  focus  of  attention  of  the  whole  system.  For  further  information,  see  HPP-80-17. 

VM  addresses  the  problem  of  making  inferences  that  relate  events  in  time.  While  EMYCIN 
implicitly  assumes  that  the  data  constitute  a  "snapshot"  of  a  situation  at  a  moment  in  time, 
VM  is  written  to  make  successive  inferences  using  data  that  are  continually  arriving  over  time. 
VM  was  successfully  tested  in  the  realtime  environment  of  a  surgical  recovery  intensive  care 
unit.  It  was  co-funded  by  NIH  Institute  of  Medical  Sciences  Grant  2522-2-02.  For  further 
information,  see  HPP-80-31. 
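
The contrast between a "snapshot" inference and an inference over data arriving in time can be sketched as follows. This is a hypothetical illustration, not VM's rule language: the class name `TrendMonitor`, the window size, the threshold, and the conclusion labels are all assumptions made for the example.

```python
from collections import deque
from typing import Deque, List, Tuple


class TrendMonitor:
    """Draws conclusions from a stream of timed readings, not one snapshot."""

    def __init__(self, window: int, threshold: float) -> None:
        # Only the most recent `window` readings are retained.
        self.readings: Deque[Tuple[float, float]] = deque(maxlen=window)
        self.threshold = threshold

    def observe(self, time: float, value: float) -> List[str]:
        """Record a new reading and return the conclusions that now hold."""
        self.readings.append((time, value))
        conclusions = []
        # Snapshot-style inference: depends on the current reading only.
        if value > self.threshold:
            conclusions.append("value_high_now")
        # Temporal inference: depends on how recent readings relate in time.
        if len(self.readings) == self.readings.maxlen:
            values = [v for _, v in self.readings]
            if all(a < b for a, b in zip(values, values[1:])):
                conclusions.append("value_rising")
        return conclusions


monitor = TrendMonitor(window=3, threshold=10.0)
for t, v in [(0, 5.0), (1, 6.0), (2, 7.0), (3, 11.0)]:
    print(t, monitor.observe(t, v))
```

The same reading can support different conclusions depending on what preceded it, which a snapshot-based interpreter has no way to express.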

2.2.3.  EURISKO,  GLISP,  MRS 

These  are  three  additional  domain-independent  reasoning  systems  on  which  considerable 
research  was  performed.  EURISKO  uses  a  frame-based  representation  of  starting  assumptions 
and  definitions  in  order  to  discover  new  knowledge  in  a  domain  through  opportunistic  search 
of  the  interesting  combinations  of  the  primitive  concepts.  It  is  a  domain-independent 
outgrowth  of  the  AM  (Automated  Mathematician)  work  of  Professor  Douglas  Lenat. 
EURISKO  was  tested  on  domains  as  varied  as  war  gaming  and  integrated  circuit  design 
architectures.  It  was  co-funded  by  ONR  Contract  N00014-80-C-0609.  For  further 
information, see HPP-80-26, HPP-81-22, HPP-82-25, and HPP-82-26.

GLISP  is  a  large  package  built  on  InterLisp  that  provides  a  conversational  environment  in 
which  object-oriented  systems  can  be  constructed.  It  manages  context  and  provides  a  graphical 
interface  that  frees  the  knowledge  system  programmer  from  many  low-level  programming 
concerns.  GLISP  was  built  by  Professor  Gordon  Novak  of  the  University  of  Texas  while  a 
visiting  faculty  member  at  Stanford.  For  further  information,  see  HPP-82-1. 

MRS  is  a  framework  for  building  reasoning  systems  within  a  logic-based  representation.  It 
provides  a  toolkit  for  predicate  calculus  systems  that  was  previously  available  only  for  rule  and 
frame-based systems. Included with MRS is a diverse repertory of commands for asserting
and  retrieving  information,  with  various  representations,  inference  techniques,  and  search 
strategies.  What  differentiates  MRS  from  other  knowledge  representation  systems  is  its  ability 
to  observe,  reason  about,  and  control  its  own  activity.  Within  MRS,  the  system  is  treated  as  a 
domain  in  its  own  right.  MRS  work  was  co-funded  by  ONR  Contract  N00014-81-K-0004.  For 
further information, see HPP-80-18, HPP-80-24, HPP-82-27, and HPP-83-26.


2.3.  The  AGE  Project 

The  primary  objective  of  the  AGE  (Attempt  to  GEneralize)  Project  was  to  build  a  software 
laboratory  designed  to  speed  up  the  process  of  constructing  expert  systems.  This  involved  two 
major  tasks: 

•  Tool Building--to isolate the inference, control, and representation techniques used
in other expert systems and extract the domain-independent portions for use in new
domains.

•  User Interface Building--to build an intelligent front-end to guide the user in
constructing expert systems with the domain-independent tools.


2.3.1.  System  Organization 

The  current  AGE  system  provides  the  user  with  a  set  of  pre-programmed  modules  called 
components.  Using  different  combinations  of  components,  the  user  can  build  a  variety  of 
programs that display different problem-solving behavior. AGE also provides user interface
modules  that  help  the  user  in  constructing  and  specifying  the  details  of  the  components. 

A component is a collection of functions and variables that support conceptual entities in
program form. For example, a production rule, as a component, consists of a rule interpreter
and  various  strategies  for  rule  selection  and  execution.  The  components  in  AGE  have  been 
selected and modularized to be usable in combinations. AGE currently provides novice
users with two predefined configurations of components called frameworks. One, called
the  Blackboard  framework,  is  for  building  programs  based  on  the  Blackboard  Model  with  a 
globally  accessible  data  structure  called  a  blackboard  and  independent  sources  of  knowledge 
which cooperate to form hypotheses. The other framework, called the Backchain framework, is
for building programs that use backward-chained production rules as the primary mechanism of
generating  inferences. 

To support the user in the selection, specification, and use of components, AGE is organized
around  five  major  subsystems  that  interact  in  various  ways.  These  subsystems  are:  Browse, 
Design,  Acquisition,  Interpreter,  and  Explanation.  A  system  executive  allows  the  user  to  access 
the  subsystems  through  the  use  of  menu  selection. 

The  Browse  and  Design  subsystems  help  familiarize  the  user  with  AGE  and  guide  him  in  the 
construction of expert systems through the use of predefined frameworks. The Acquisition
subsystem is a collection of interface modules that help the user specify the various
components  of  the  framework.  The  Interpreter  subsystem  is  designed  for  executing,  testing, 
and  refining  the  user  program.  The  Explanation  subsystem  answers  questions  about  the 
execution  of  the  user's  program. 

In addition, AGE provides an interface to the Units System (developed within the
MOLGEN  Project  at  the  HPP),  an  object-oriented,  frame-based  knowledge  representation  tool. 
This  gives  the  user  additional  flexibility  in  representing  knowledge  as  well  as  providing  an 
alternative  hypothesis  structure  for  blackboard-based  systems. 


2.3.2.  AGE  Status 

AGE  is  fully  implemented  as  a  first-generation  system.  It  has  been  used  in  medical  domains 
like  the  PUFF  (Pulmonary  Function)  system  and  signal  understanding  domains  like 
HASP/SIAP  (a  system  for  locating  ships  at  sea  based  upon  sonar  signals).  It  is  especially 
useful for comparing problem-solving methods on a single problem; PUFF was tested using
the built-in forward-chaining and backward-chaining frameworks and a custom assembled
model-based framework. Current work is proceeding on an AGE-II system that is fully
usable by a domain expert with no applied artificial intelligence training.


2.4.  The  MOLGEN  Project 

The  MOLGEN  project  is  a  joint  effort  among  computer  scientists  and  molecular  biologists  to 
explore  applications  of  artificial  intelligence  to  problems  in  the  rapidly  expanding  domain  of 
molecular genetics. A central theme of this period of the research was the problem of
designing laboratory experiments: producing an ordered list of steps that, when implemented
in a biological laboratory, would satisfy a given goal in analysis or synthesis.

During  the  first  year  of  this  period,  initial  work  was  completed  on  two  quite  different 
planning  systems.  The  first  was  based  upon  a  study  on  how  human  scientists  design 
experiments.  This  resulted  in  a  theory  known  as  skeletal  plans,  the  idea  that  almost  all 
experimental  designs  resulted  from  the  instantiation  of  an  abstract  design  with  specific 
laboratory  steps  suitable  to  the  exact  strategic  and  environmental  conditions  of  the  experiment. 
The  skeletal  plans  themselves  were  taken  as  part  of  the  scientist's  personal  knowledge  base,  not 
to  be  recreated  for  each  new  experiment;  for  example,  there  is  strong  evidence  that  a  single 
abstract  design  for  cloning  experiments  forms  the  basis  for  the  great  majority  of  all  such 
experiments  performed.  The  design  system  operates  by  locating  a  potentially  relevant  skeletal 
plan  from  its  knowledge  base  and  then  refining  that  plan  by  proceeding  hierarchically  through 
a  separate  knowledge  base  of  laboratory  tools  and  techniques.  For  more  information  about  this 
system,  see  HPP-79-29. 
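
The instantiation step can be pictured with a small sketch. It is a deliberately flattened, one-level version of the idea (the system described above refines hierarchically), and every name in it is hypothetical: the abstract steps in `SKELETAL_PLAN`, the entries in `TECHNIQUES`, and the condition labels are placeholders, not contents of the MOLGEN knowledge base.

```python
from typing import Dict, List, Set

# Hypothetical abstract design for a cloning experiment.
SKELETAL_PLAN: List[str] = ["cut", "join", "insert", "select"]

# Hypothetical technique knowledge base: for each abstract step, candidate
# laboratory techniques in order of preference, with required conditions.
TECHNIQUES: Dict[str, List[dict]] = {
    "cut": [
        {"name": "restriction_digest", "requires": {"has_restriction_site"}},
        {"name": "mechanical_shearing", "requires": set()},
    ],
    "join": [{"name": "ligation", "requires": set()}],
    "insert": [{"name": "transformation", "requires": set()}],
    "select": [
        {"name": "antibiotic_selection", "requires": {"resistance_marker"}},
        {"name": "colony_screening", "requires": set()},
    ],
}


def instantiate(plan: List[str], techniques: Dict[str, List[dict]],
                conditions: Set[str]) -> List[str]:
    """Replace each abstract step with the first technique whose
    requirements are met by the experiment's conditions."""
    concrete = []
    for step in plan:
        for technique in techniques[step]:
            if technique["requires"] <= conditions:  # subset test
                concrete.append(technique["name"])
                break
        else:
            raise ValueError(f"no applicable technique for step {step!r}")
    return concrete


print(instantiate(SKELETAL_PLAN, TECHNIQUES, {"has_restriction_site"}))
```

The abstract plan is stored once and reused; only the choice of concrete techniques varies with the conditions of a particular experiment.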

The  second  planning  system  was  based  more  upon  computer  science  theoretic  grounds, 
making maximal use of the interactions among potential plan steps and the constraints they
impose upon the growing plan. This work led to a design methodology called constraint
propagation and a global structure for planning known as metaplanning, which separated
planning  decisions  into  domain-dependent  and  domain-independent  classes.  For  more 
information  about  this  system,  see  HPP-80-2. 
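
A generic form of constraint propagation can be sketched as below. This is a textbook-style domain-filtering loop offered under stated assumptions, not the planner's own mechanism: the variable names, the candidate values (`E1`, `V1`, `M1`, and so on), and the compatibility tables are hypothetical.

```python
from typing import Callable, Dict, List, Set, Tuple

# A binary constraint: (variable x, variable y, test on a pair of values).
Constraint = Tuple[str, str, Callable[[str, str], bool]]


def propagate(domains: Dict[str, Set[str]],
              constraints: List[Constraint]) -> Dict[str, Set[str]]:
    """Repeatedly discard values that no longer have a compatible partner."""
    domains = {var: set(values) for var, values in domains.items()}
    changed = True
    while changed:
        changed = False
        for x, y, ok in constraints:
            keep_x = {a for a in domains[x] if any(ok(a, b) for b in domains[y])}
            keep_y = {b for b in domains[y] if any(ok(a, b) for a in keep_x)}
            if keep_x != domains[x] or keep_y != domains[y]:
                domains[x], domains[y] = keep_x, keep_y
                changed = True  # a narrowing here may narrow other variables
    return domains


# Hypothetical plan variables and compatibility tables.
ENZYME_VECTOR = {("E1", "V1"), ("E2", "V2")}
VECTOR_MARKER = {("V1", "M1")}

result = propagate(
    {"enzyme": {"E1", "E2"}, "vector": {"V1", "V2"}, "marker": {"M1"}},
    [("enzyme", "vector", lambda e, v: (e, v) in ENZYME_VECTOR),
     ("vector", "marker", lambda v, m: (v, m) in VECTOR_MARKER)],
)
print(result)
```

A restriction introduced by one plan step (here the marker) narrows the choices left open for other steps without any step being committed prematurely.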

Both of the above-mentioned systems were operational on actual problems in molecular
biology; however, considerable work was conducted to expand the scope of the biological
knowledge base to allow a wider variety of problems, and more difficult problems, to be
handled by the systems. The knowledge base increased in size by an order of magnitude during
this  time  period,  with  particular  emphasis  on  synthetic  problems  of  current  interest  to  our 
research  collaborators. 

In  addition,  new  work  was  begun  on  extracting  the  best  features  of  the  two  planning  systems 
and producing a new, second-generation design system. This system, called SPEX, was finished
at  the  end  of  the  contract  period,  and  was  tested  successfully  on  a  variety  of  problems  and 
domains.  For  more  information,  see  HPP-82-22. 

MOLGEN  work  was  co-funded  by  NSF  grant  ECS-80-16247. 


2.5.  The  CRYSALIS  Project 

The  task  of  interpreting  three-dimensional  signal  information  is  a  difficult  one  for 
knowledge-based systems because of the combinatoric explosion that results from the vast
amount  of  knowledge  that  must  be  applied  to  solve  all  real-world  problems  of  this  type. 
Research  into  this  problem  occurred  within  the  task  domain  of  X-ray  protein  crystallography 
under  the  CRYSALIS  Project.  Here  the  goal  is  to  interpret  a  three-dimensional  image  of  the 
electron  density  cloud  surrounding  a  molecule  in  order  to  determine  the  precise  location  of  the 
individual  atoms  which  form  the  molecule. 


2.5.1.  Problem-Solving  Architecture 

A major achievement of CRYSALIS came in solving the problem of how the limited resources
of a signal interpretation system should be allocated when many plausible choices exist. This is
known as the focus-of-attention problem. A solution was developed using the expert's strategic
knowledge to guide the system's problem-solving activities. These control heuristics are
represented (as is all other knowledge in CRYSALIS) as production rules and are organized
into a hierarchical production system. Control in this architecture proceeds from the top
down,  through  many  levels  of  control  down  to  object-level  heuristics  at  the  bottom  of  the 
hierarchy.  Each  level  is  a  complete  production  system  that  examines  the  current  situation  and 
invokes  one  or  more  sets  of  rules  at  the  next  lower  level.  As  control  moves  from  general  to 
very  specific  strategies,  from  a  broad  to  a  very  narrow  view  of  the  situation,  focus  of  attention 
is  achieved  in  a  very  clear  and  efficient  manner. 

The  particular  architecture  chosen  was  the  blackboard  architecture  first  demonstrated  in  the 
HEARSAY-II  speech  understanding  system.  Knowledge  in  CRYSALIS  is  partitioned  into 
independent  knowledge  sources  (KSs).  These  KSs  communicate  by  means  of  a  global  database 
called  the  blackboard  which  contains  a  multi-level  hypothesis  structure  and  the  data.  The 
CRYSALIS  system  is  data-driven  in  that  KSs  react  only  to  changes  in  the  data  on  the 
blackboard.  An  important  point  is  that  the  strategic  rules  that  determine  the  flow  of  control 
from  one  level  of  the  hierarchy  to  another  are  themselves  KSs  in  the  blackboard  system 
(known  as  control  knowledge  sources). 
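
The data-driven character of a blackboard system can be sketched as follows. The sketch is an assumption-laden miniature: the control policy is trivial (first triggered source fires) rather than the hierarchical control rules described above, and the names `KnowledgeSource`, `run`, `build_skeleton`, and `place_atoms` are hypothetical.

```python
from typing import Callable, Dict, List

Blackboard = Dict[str, object]


class KnowledgeSource:
    """An independent unit of knowledge that reacts to blackboard contents."""

    def __init__(self, name: str,
                 trigger: Callable[[Blackboard], bool],
                 action: Callable[[Blackboard], None]) -> None:
        self.name = name
        self.trigger = trigger
        self.action = action


def run(blackboard: Blackboard, sources: List[KnowledgeSource],
        max_cycles: int = 20) -> List[str]:
    """Fire triggered knowledge sources until none applies; return the order."""
    trace = []
    for _ in range(max_cycles):
        ready = [ks for ks in sources if ks.trigger(blackboard)]
        if not ready:
            break
        chosen = ready[0]  # placeholder control policy, not CRYSALIS's
        chosen.action(blackboard)
        trace.append(chosen.name)
    return trace


# Listed out of order on purpose: data, not position, decides what fires.
SOURCES = [
    KnowledgeSource(
        "place_atoms",
        lambda bb: "skeleton" in bb and "atoms" not in bb,
        lambda bb: bb.update(atoms=f"atoms placed along {bb['skeleton']}")),
    KnowledgeSource(
        "build_skeleton",
        lambda bb: "density_map" in bb and "skeleton" not in bb,
        lambda bb: bb.update(skeleton=f"skeleton of {bb['density_map']}")),
]

board: Blackboard = {"density_map": "map-1"}
print(run(board, SOURCES), board)
```

The knowledge sources never call one another; each reacts only to what appears on the shared structure, so replacing the line that picks `chosen` is enough to change the control strategy.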


2.5.2.  Current  Status 

At  the  end  of  this  report  period,  CRYSALIS  was  a  successful  demonstration  system.  It  can 
solve (i.e., find atom locations) for a medium-sized protein in about a day of computing on a
dual  DEC  KI-10  computer  system.  This  is  considerably  better  than  traditional  numeric  systems 
on  far  larger  computers.  The  system  contains  66  knowledge  sources  with  602  rules  of 
inference.  The  system  is  integrated  with  several  FORTRAN  pre-processing  programs  that 
skeletonize the original electron density map, find its critical points, and create the data
representations  used  as  input  by  CRYSALIS. 

CRYSALIS  work  was  co-funded  by  NSF  Grant  MCS79-33666.  Further  information  may  be 
found in HPP-79-16 and HPP-83-19.


2.6.  The  Knowledge-Based  VLSI  Project--KB-VLSI

The KB-VLSI project is directed toward the development of AI techniques and applications
to computer-aided design tools for integrated circuit design. The overall goals of the project
are: 


•  To  identify  and  articulate  expert  knowledge  used  in  integrated  circuit  design. 

•  To  develop  methods  for  representing  and  reasoning  with  this  knowledge. 

•  To  develop  knowledge-based  expert  systems  for  assisting  in  the  integrated  circuit 
design,  test,  and  debug  cycle. 

In  a  broader  sense,  the  KB-VLSI  project  is  concerned  with  derivation  of  and  experimentation 
with  knowledge-based  system  paradigms  appropriate  for  design-synthesis  tasks.  The  major  part 
of  the  KB-VLSI  activity  during  the  course  of  this  contract  concentrated  on  the  development  of 
Palladio,  an  experimental,  but  operational,  knowledge-based  design  system. 

The  KB-VLSI  project  is  a  collaborative  project  involving  the  HPP,  Xerox  Palo  Alto  Research 
Center,  and  Fairchild  Advanced  Research  Labs. 


2.6.1.  AI  Issues 

The  domain  of  integrated  circuit  design  and  testing  is  particularly  rich  for  development  of 
and  experimentation  with  new  AI  techniques.  The  AI  research  areas  of  greatest  relevance  to 
the  project  are: 

•  The  "natural  language"  most  often  used  for  specifying  the  structure  of  an  IC  is 
graphical  rather  than  textual.  Few  current  AI  systems  exploit  the  power  and 
flexibility  of  high-resolution  graphics  devices.  Research  and  development  on 
intelligent graphics interfaces for entering, editing, storing, and perusing complex
knowledge bases involves a spectrum of AI concerns ranging from cognitive
psychology through formal languages and knowledge representation systems. Such
interfaces would be applicable to a variety of AI systems, and they are necessary for
intelligent  CAD  systems. 

•  The  specification  of  an  IC  includes  structural  and  behavioral  descriptions  of  the 
circuit.  These  specifications  may  be  hierarchical  with  respect  to  a  part-of  hierarchy 
as  well  as  contain  descriptions  of  the  circuit  at  various  levels  of  abstraction. 
Current  representation  systems  are,  at  best,  only  marginally  capable  of  representing 
effectively  such  complex  specifications.  This  problem  is  further  complicated  by  the 
need to sometimes concurrently consider alternative specifications of a circuit.

•  Much  of  the  expert  knowledge  used  in  IC  design  is  in  the  form  of  tradeoffs  rather 
than  constraints.  Although  there  has  been  work  done  on  designing  with  constraints, 
little  has  been  done  on  using  tradeoffs. 


2.6.2.  Palladio's  Model  of  the  Design  Process 

The  creation  of  behavioral  and  structural  specifications  of  a  circuit  usually  involves  a 
sequence of transformations from abstract specifications to more detailed implementation
descriptions. For example, the design of a combinational logic circuit may involve first
transforming  a  specification  of  the  circuit  in  terms  of  boolean  equations  which  relate  inputs 
and  outputs  into  a  specification  in  terms  of  logic  gates  and  interconnection  networks.  This 
may  then  be  transformed  into  a  layout  specification  expressed  in  terms  of  "colored"  rectangles. 

A  useful  metaphor  for  this  transformation  process  is  that  design  is  search.  The  designer 
searches  in  a  solution  space  of  implementation  specifications.  Moves  in  this  space  are  design 
decisions.  Each  decision  involves  considering  alternative  implementations,  testing  the 
alternatives against the constraints and goals imposed by the abstract specifications, and using
tradeoffs to differentiate between "satisficing" alternatives and to resolve conflicts between
incompatible  constraints  and  goals.  This  process  is  difficult  because  the  solution  space  is  large, 
the  generation  of  alternative  solutions  is  expensive,  information  is  incomplete,  and  it  is 
impossible  to  predict  all  of  the  consequences  of  a  decision. 
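
The "design is search" metaphor, with constraints used to reject alternatives and a tradeoff used to rank the survivors, can be sketched as follows. This is an illustrative assumption, not Palladio's procedure: it enumerates the whole space, which is feasible only for a toy example, and the decision names and the `AREA` and `DELAY` figures are invented.

```python
from itertools import product
from typing import Callable, Dict, List, Optional

Design = Dict[str, str]


def choose_design(alternatives: Dict[str, List[str]],
                  satisfies: Callable[[Design], bool],
                  cost: Callable[[Design], int]) -> Optional[Design]:
    """Enumerate every combination of decisions, keep those meeting the
    constraints, and use the tradeoff function to pick among them."""
    decisions = list(alternatives)
    candidates = (dict(zip(decisions, combo))
                  for combo in product(*(alternatives[d] for d in decisions)))
    feasible = [c for c in candidates if satisfies(c)]
    return min(feasible, key=cost) if feasible else None


# Hypothetical figures, in arbitrary units.
AREA = {"ripple": 4, "lookahead": 9, "static": 6, "dynamic": 3}
DELAY = {"ripple": 8, "lookahead": 3, "static": 2, "dynamic": 4}
ALTERNATIVES = {"adder": ["ripple", "lookahead"],
                "register": ["static", "dynamic"]}


def total(table: Dict[str, int], design: Design) -> int:
    return sum(table[choice] for choice in design.values())


best = choose_design(
    ALTERNATIVES,
    satisfies=lambda d: total(DELAY, d) <= 10,        # hard constraint
    cost=lambda d: total(AREA, d) + total(DELAY, d),  # tradeoff: area vs. delay
)
print(best)
```

Several designs satisfy the delay constraint; only the tradeoff function separates them, which is the distinction between constraints and tradeoffs drawn in the text.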


2.6.3.  Design  Hierarchies 

IC  designers  have,  in  part,  coped  with  the  difficulty  of  design  decisions  by  exploiting 
hierarchies in the design process. One powerful technique is to decompose a device into
semi-independent subdevices and to focus attention on each individually. For example, a 4-bit
register  can  be  considered  as  four  1-bit  registers  and  interconnections.  Another  way  of 
partitioning  the  design  process  is  into  description  levels,  abstract  models  of  circuits.  Each 
description  level  provides  languages  for  describing  the  behavior  and  structure  of  a  device  which 
suppress  particular  details  of  physical  implementations  of  the  device.  This  reduces  the 
complexity  of  the  elements  in  a  solution  space  and  makes  generation  and  comparison  of 
alternatives  less  expensive. 

Description  levels  also  permit  a  designer  to  partition  concerns  by  concentrating  on  subclasses 
of  design  decisions.  For  example,  at  an  architectural  level  a  designer  can  work  out  certain 
storage  and  communication  decisions  before  worrying  about  power  considerations.  Currently, 
there  are  four  description  levels  in  Palladio:  Layout,  Clocked  Primitive  Switches  (CPS), 
Clocked  Registers  and  Logic  (CRL),  and  Linked  Module  Abstraction  (LMA).  Collectively, 
these  levels  factor  the  concerns  of  a  digital  designer. 

The  most  widely  used  description  level  in  integrated  circuit  design  is  the  artwork  or  layout 
level.  This  level  describes  circuits  in  terms  of  "colored  rectangles"  that  can  be  composed  to 
build  up  large  designs.  Associated  with  each  colored  rectangle  is  a  set  of  composition  rules, 
called  layout  design  rules.  These  rules  provide  a  shallow  model  of  composition  that  is  based 
on  a  deep  model  of  electrical  properties  and  fabrication  tolerances.  If  designers  follow  the 
rules,  their  designs  are  guaranteed  to  have  adequate  physical  spacing  on  a  chip. 

The  layout  level  has  several  properties  which  are  useful  for  the  synthesis  of  designs.  First, 
primitive  terms  can  be  combined  to  form  larger  terms  and  subsystems.  Second,  there  are 
composition rules which define allowed combinations of terms. Third, there is a
well-characterized set of bugs which are avoided when the composition rules are obeyed. At the
layout level, these bugs correspond to the function and performance problems caused by
incorrect  physical  spacing. 
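
A layout design rule of the spacing kind can be sketched as a simple geometric check. The sketch is hypothetical throughout: the layer names, the `MIN_SPACING` values, and the choice of Euclidean distance between rectangles are assumptions for illustration and do not reproduce any particular fabrication rule set.

```python
import math
from dataclasses import dataclass
from itertools import combinations
from typing import List, Tuple


@dataclass(frozen=True)
class Rect:
    """A "colored rectangle": a layer name plus corner coordinates."""
    layer: str
    x0: int
    y0: int
    x1: int
    y1: int


# Hypothetical minimum same-layer spacing, in arbitrary grid units.
MIN_SPACING = {"metal": 3, "poly": 2}


def gap(a: Rect, b: Rect) -> float:
    """Distance between two rectangles; 0.0 if they touch or overlap."""
    dx = max(0, max(a.x0, b.x0) - min(a.x1, b.x1))
    dy = max(0, max(a.y0, b.y0) - min(a.y1, b.y1))
    return math.hypot(dx, dy)


def spacing_violations(rects: List[Rect]) -> List[Tuple[Rect, Rect]]:
    """Pairs on the same layer that are separate but closer than allowed."""
    return [(a, b) for a, b in combinations(rects, 2)
            if a.layer == b.layer and 0 < gap(a, b) < MIN_SPACING[a.layer]]


wires = [Rect("metal", 0, 0, 4, 2),
         Rect("metal", 6, 0, 10, 2),   # 2 units from the first: too close
         Rect("metal", 0, 5, 4, 7)]    # 3 units from the first: acceptable
print(spacing_violations(wires))
```

The check knows nothing about electrical behavior; it is a shallow rule of the kind described above, standing in for the deeper model from which the spacing values would be derived.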

The other three more abstract description levels have properties analogous to those of
the  layout  level.  The  CPS  level  distinguishes  between  different  uses  for  logic  and  is  concerned 
with  the  digital  behavior  of  a  system.  Different  uses  of  logic  include  steering  logic,  clocking 
logic,  and  restoring  logic.  The  composition  rules  at  this  level  prevent  bugs  of  non-digital 
behavior  caused  by  charge  sharing  and  invalid  switching  levels.  The  CRL  level  is  concerned 
with  the  composition  of  combinational  register  logic.  The  composition  rules  at  this  level 
preclude  various  bugs  related  to  clocking  in  a  two-phase  system.  The  LMA  level  is  concerned 
with  the  sequencing  of  computational  events  in  a  digital  system.  It  describes  the  paths  along 
which data can flow, the sequential and parallel activation of computations, and the distribution
of  registers.  The  composition  rules  at  this  level  preclude  bugs  such  as  starting  computations 
before  data  are  ready,  and  deadlock  bugs  that  arise  from  the  improper  use  of  shared  modules. 


2.6.4.  Design  Knowledge  Bases 

Much  of  the  design  of  ICs  is  done  by  using  parts  of  existing  designs,  possibly  with 
modification.  This  technique  exploits  the  fact  that  common  constructs  are  used  in  many 
circuits,  e.g.,  registers,  NAND  gates,  and  I/O  pads.  In  Palladio,  knowledge  about  previously 
defined  circuits  is  kept  in  community  knowledge  bases.  These  knowledge  bases  also  contain 
rules about the composition and optimization of circuit components. For example, at the CPS
level we have developed a knowledge base that includes a collection of prototype logic gates, a
set of rules that define allowed compositions of these gates, and a set of optimization rules for
reducing  various  costs  of  circuits  composed  of  networks  of  gates. 

The use of community knowledge bases in Palladio is supported by the LOOPS system, an
object- and data-oriented programming system implemented in InterLisp. LOOPS was created,
in  particular,  to  support  a  design  environment  in  which  knowledge  bases  are  shared  and  can  be 
incrementally  updated. 


2.6.5.  Design  Evolution 

The  design  of  an  integrated  circuit  is  an  evolutionary  process  that  follows  an  iterative  cycle: 
create  a  candidate  design,  test  the  design  against  current  requirements,  modify  the  design 
and/or  requirements  to  create  a  new  candidate  design.  A  design  system  should  have  facilities 
for  interactive  simulation  to  provide  rapid  feedback  between  proposed  changes  and  their 
exercise  on  test  cases. 

Within  Palladio,  we  have  built  interactive,  rule-based  symbolic  circuit  simulators.  These 
simulators  use  symbolic  reasoning  on  a  hierarchy  of  behavioral  and  structural  specifications  for 
a  circuit  in  order  to  predict  the  outputs  of  the  circuit  given  a  set  of  inputs.  The  simulators 
include  a  dynamic  display  capability. 
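
The idea of rule-based symbolic simulation over a hierarchy of specifications can be illustrated with a minimal sketch. It is not Palladio or LOOPS code: the names `simplify`, `simulate`, and `CIRCUITS`, the netlist format, and the three rewrite rules are all assumptions made for illustration. Inputs may be constants (`True`/`False`) or symbols (strings), and composite circuits are described structurally in terms of other circuits.

```python
# Hypothetical sketch; none of these names come from Palladio or LOOPS.

def simplify(op, args):
    """Apply a primitive gate to constant or symbolic inputs using rewrite rules."""
    if op == "NOT":
        (a,) = args
        if isinstance(a, bool):
            return not a                      # rule: constant folding
        if isinstance(a, tuple) and a[0] == "NOT":
            return a[1]                       # rule: NOT(NOT x) -> x
        return ("NOT", a)
    if op == "AND":
        if any(a is False for a in args):
            return False                      # rule: x AND 0 -> 0
        rest = [a for a in args if a is not True]   # rule: x AND 1 -> x
        if not rest:
            return True
        return rest[0] if len(rest) == 1 else ("AND", *rest)
    if op == "NAND":
        return simplify("NOT", [simplify("AND", args)])
    raise ValueError(f"unknown primitive {op!r}")

# Structural specifications: name -> (input nets, output net, parts).
# Each part is (net driven, component, source nets), listed in evaluation order.
CIRCUITS = {
    "AND2": (["a", "b"], "y",
             [("n1", "NAND", ["a", "b"]), ("y", "NOT", ["n1"])]),
    "MUX2": (["s", "d0", "d1"], "y",
             [("ns", "NOT", ["s"]),
              ("t0", "AND2", ["d0", "ns"]),
              ("t1", "AND2", ["d1", "s"]),
              ("u0", "NOT", ["t0"]),
              ("u1", "NOT", ["t1"]),
              ("y", "NAND", ["u0", "u1"])]),
}

def simulate(name, inputs):
    """Predict the output of circuit `name`, descending into sub-circuits as needed."""
    in_names, out_name, parts = CIRCUITS[name]
    nets = dict(zip(in_names, inputs))
    for net, comp, srcs in parts:
        args = [nets[s] for s in srcs]
        nets[net] = simulate(comp, args) if comp in CIRCUITS else simplify(comp, args)
    return nets[out_name]

# Select line known, data lines left symbolic.
print(simulate("MUX2", [True, "x", "y"]))
```

With the select input fixed and the data inputs left symbolic, the rules reduce the multiplexer to the selected data symbol, which is the kind of prediction a symbolic simulator can make without enumerating concrete input values.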


2.6.6.  Status  of  KB-VLSI 

The  supporting  framework  for  Palladio  has  been  fully  built.  This  includes  LOOPS,  a  high- 
level,  object-oriented  graphics  package  called  HILGA,  and  GLISP  (discussed  earlier  in  this 
report)  which  provides  LOOPS  with  optimized  data  and  procedure  access. 

Prototype  community  knowledge  bases  for  the  CPS  and  LMA  levels  are  completed.  Initial 
knowledge  bases  for  the  layout  and  CRL  levels  are  under  development.  A  rule-based  design 
editor  for  the  CPS  level  has  been  implemented. 

Rule-based  symbolic  simulators  for  the  LMA  and  CPS  levels  are  operational.  Palladio  is 
now  useful  for  the  design  of  at  least  simple,  "student-level"  integrated  circuits. 

Further  information  about  the  KB-VLSI  project  may  be  found  in  HPP-82-2,  HPP-82-5,  and 
HPP-82-11. 


2.7.  The  Handbook  of  Artificial  Intelligence 

Incorporating  the  efforts  of  nearly  200  computer  science  researchers  as  writers,  editors,  and 
reviewers,  the  Handbook  of  Artificial  Intelligence  is  an  encyclopedic  compilation  of  articles 
covering  the  entire  field  of  artificial  intelligence.  It  satisfies  the  urgent  need  for  AI  to  "go 
public,"  making  the  full  range  of  its  important  techniques  and  concepts  available  for  the  first 
time  to  the  rapidly  expanding  world  of  potential  users.  Its  scope,  readability,  and  organization
have  made  it  the  standard  reference  work  in  AI  for  both  newcomers  and  experienced  members 
of  the  research  community.  It  also  provides  the  most  comprehensive  survey  of  the  field's 
literature  available. 

The  work  consists  of  approximately  1500  pages  in  three  volumes.  Volume  I,  released  in  1981,
contains  major  sections  on  search,  knowledge  representation,  and  understanding  natural  and 
spoken  language.  Volume  II,  released  in  1982,  discusses  AI  programming  languages, 
applications  of  AI  to  science,  medicine,  and  education,  and  automatic  programming.  Volume 
III,  also  released  in  1982,  contains  chapters  on  cognitive  models,  deduction,  vision,  learning, 
planning,  and  problem-solving. 

All  three  volumes  were  published  by  William  Kaufmann,  Inc.  of  Los  Altos,  California.  To 
date,  approximately  90,000  copies  have  been  sold.  Royalties  from  the  handbook  provide 
funding  for  HPP  students  to  attend  artificial  intelligence  conferences.  Co-funding  for  the 
production  of  the  handbook  was  provided  by  the  NIH  Bureau  of  Research  Programs.  Many  of 
the  individual  chapters  are  available  as  HPP  Reports. 


2.8.  Technology  Transfer 

The  HPP  has  maintained  a  consistent  record  of  active  transfer  of  its  ideas  and  systems  into 
the  academic,  governmental,  and  industrial  sectors.  For  all  of  the  projects  described  in  this 
report,  publications  have  appeared  both  in  the  computer  science  and  the  domain-specific 
literature.  HPP  scientists  have  organized  and  participated  in  numerous  symposia,
conferences,  and  workshops. 

Many  of  the  systems  described  in  this  report  have  been  distributed,  at  copying  and  shipping 
cost,  to  other  sites.  In  particular,  EMYCIN,  AGE,  MRS,  and  GLISP  have  been  sent  to  over
100  locations,  including  such  DOD-related  companies  as  Fairchild,  RCA,  ESL,  Mitre,  HP,
Honeywell,  GTE,  Hughes,  IBM,  Lockheed,  NCR,  Boeing,  and  Systems  Control. 


3.  VLSI  Theory/Silicon  Compilation  Project

As  described  in  the  introduction  to  this  report,  the  VLSI  Theory/Silicon  Compilation  work 
was  performed  outside  of  the  HPP,  under  the  direction  of  Professor  Jeffrey  Ullman.  A 
separate  references  list  is  included  at  the  end  of  this  section  and  referred  to  within  the  text. 


3.1.  Regular  Expression  Compilation 

The  basic  idea  is  to  translate  regular  expressions  into  control  structures,  such  as  PLA’s.  The 
regular  expression  language  is  an  easy  way  to  describe  collections  of  processes  that  independently
look  for  patterns  on  a  shared  input.  In  a  sense,  regular  expressions  are  an  alternative  to 
conventional  finite  state  machine  description  languages.  They  are  awkward  for  some  things, 
but  they  are  also  elegant  and  natural  for  other  purposes.  For  example,  they  are  a  natural  way 
to  describe  communication  protocols  or  processor  control  units. 

The  project  has  considered  and  tried  out  a  number  of  ways  to  translate  these  expressions  to 
silicon,  but  the  central  theme  is  that  regular  expressions  are  translated  into  nondeterministic 
automata,  and  the  nondeterministic  automata  have  their  states  encoded  in  a  way  that  allows  the 
automaton  to  be  implemented  by  a  conventional  form  of  logic,  such  as  PLA's.  (See  [1]  for
definitions  and  simple  introductions  to  the  necessary  concepts,  such  as  regular  expressions  and
non-deterministic  automata.) 
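
The first half of this pipeline, from regular expression to nondeterministic automaton, can be sketched as follows. This is a minimal illustration using the textbook Thompson construction, not the project's compiler: the tuple-based expression syntax and the names `NFA`, `build`, `closure`, and `matches` are assumptions, and the state-coding and PLA-generation steps are not shown.

```python
import itertools

class NFA:
    """Nondeterministic automaton with epsilon moves (illustrative only)."""
    def __init__(self):
        self.eps = {}      # state -> set of states reachable on epsilon
        self.step = {}     # (state, symbol) -> set of next states
        self._ids = itertools.count()

    def new_state(self):
        s = next(self._ids)
        self.eps[s] = set()
        return s

def build(nfa, expr):
    """Thompson's construction: return (start, accept) of a fragment for expr.

    expr is ("sym", c), ("cat", r, s), ("alt", r, s), or ("star", r).
    """
    kind = expr[0]
    start, accept = nfa.new_state(), nfa.new_state()
    if kind == "sym":
        nfa.step.setdefault((start, expr[1]), set()).add(accept)
    elif kind == "cat":
        s1, a1 = build(nfa, expr[1])
        s2, a2 = build(nfa, expr[2])
        nfa.eps[start].add(s1)
        nfa.eps[a1].add(s2)
        nfa.eps[a2].add(accept)
    elif kind == "alt":
        for sub in expr[1:]:
            s, a = build(nfa, sub)
            nfa.eps[start].add(s)
            nfa.eps[a].add(accept)
    elif kind == "star":
        s, a = build(nfa, expr[1])
        nfa.eps[start] |= {s, accept}
        nfa.eps[a] |= {s, accept}
    else:
        raise ValueError(f"unknown operator {kind!r}")
    return start, accept

def closure(nfa, states):
    """All states reachable from `states` using epsilon moves only."""
    stack, seen = list(states), set(states)
    while stack:
        for t in nfa.eps[stack.pop()]:
            if t not in seen:
                seen.add(t)
                stack.append(t)
    return seen

def matches(expr, text):
    """Run the automaton on text, tracking the set of active states."""
    nfa = NFA()
    start, accept = build(nfa, expr)
    current = closure(nfa, {start})
    for ch in text:
        moved = set()
        for s in current:
            moved |= nfa.step.get((s, ch), set())
        current = closure(nfa, moved)
    return accept in current

# (a|b)*c
pattern = ("cat", ("star", ("alt", ("sym", "a"), ("sym", "b"))), ("sym", "c"))
print(matches(pattern, "abac"), matches(pattern, "ab"))
```

The set `current` plays the role of the automaton's state at each input symbol; choosing a compact binary code for such sets of active states is the coding problem that the hardware translation has to solve.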

The  first  attempt  to  build  a  regular  expression  compiler  was  based  on  the  idea  that  a  small 
circuit  could  be  laid  out  corresponding  to  each  operand,  and  these  circuits  could  be  wired 
together  in  systematic  ways  to  reflect  the  operators  of  the  expression.  The  ideas  were  described 
in  [2],  but  the  resulting  layouts  were  found  on  average  to  be  considerably  worse  than  those
produced  by  the  methods  we  tried  later.

Our  next  approach  was  to  select  small  subexpressions  that  could  be  implemented  by  PLA’s  in 
a  simple  fashion,  and  to  wire  the  PLA's  together  in  ways  that  reflect  the  operators  of  the
expression,  as  before.  In  addition,  we  used  a  heuristic  search  for  the  "best"  subexpressions  to 
choose.  These  improvements  were  described  in  [3,4]. 

At  this  point,  we  realized  that  coding  the  states  of  the  nondeterministic  automata  that 
correspond  to  each  of  the  selected  subexpressions  is  a  critical  problem,  and  we  tried  a  number 
of  different  ways  to  find  efficient  codes.  Paper  [5]  discusses  an  early  attempt  and  also  discusses  the
(deterministic)  "state"  feature,  that  enables  the  regular  expression  language  to  include 
conventional  finite  state  machine  languages  for  controller  specification.  That  paper  also 
described  a  brief  flirtation  with  the  Weinberger  array  oriented  approach  of  Steve  Johnson  (Bell 
Labs).  We  found  that  the  sizes  of  circuits  obtained  by  generating  Johnson's  lgen  language
from  regular  expressions  were  comparable  to  those  we  obtained  by  our  own  PLA  method.

Our  best  coding  method  to  date  is  described  in  [6].  With  this  approach,  we  generate  PLA’s 
that  are  superior  to  hand  designs  for  many  of  the  benchmark  problems  that  we  accumulated 
during  the  course  of  the  project. 


3.2.  Routing 

We  developed  a  provably  optimal  and  highly  efficient  algorithm  for  river-routing  in  a
rectangular  channel  [7].  These  ideas  have  been  extended  to  wiring  rules  more  general  than
rectangular  wiring,  e.g.,  rules  permitting  45-degree  wires,  and  more  general  configurations,  such
as  the  "bristle  blocks"  problem,  where  the  sides  of  the  channel  consist  of  rigid  modules  able
to  slide  horizontally  relative  to  one  another.  These  extensions  are  covered  in  [8-10].


3.3.  PLA  Folding 

A  more  general  PLA  folder,  which  considers  the  possibility  that  input  wires  need  to  come  in
true/complement  pairs,  was  implemented  [11].  Like  most  PLA  folders,  it  uses  a  greedy  heuristic,
finding  legal  folds  and  making  them.  A  graph-theoretic  representation  makes  the  test  for 
legality  of  a  fold  relatively  efficient. 
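
A greedy folder of this general kind can be sketched as follows. This is a simplified illustration of column folding, not the folder of [11]: it ignores the true/complement pairing constraint, and the names `has_cycle` and `greedy_fold` and the input format are assumptions. Two columns may share one physical column when no row uses both and when placing all rows of the upper column above all rows of the lower one does not contradict earlier folds; that second condition is tested as acyclicity of a row-ordering graph.

```python
from itertools import permutations

def has_cycle(edges, nodes):
    """Kahn's algorithm: True if the directed graph contains a cycle."""
    indeg = {n: 0 for n in nodes}
    for u in edges:
        for v in edges[u]:
            indeg[v] += 1
    ready = [n for n in nodes if indeg[n] == 0]
    seen = 0
    while ready:
        u = ready.pop()
        seen += 1
        for v in edges.get(u, ()):
            indeg[v] -= 1
            if indeg[v] == 0:
                ready.append(v)
    return seen != len(nodes)

def greedy_fold(columns):
    """columns maps a column name to the set of row ids that use it.

    Returns (top, bottom) pairs of columns that share one physical column.
    """
    nodes = set().union(*columns.values())   # one node per row
    edges = {}                               # u -> v means u is placed above v
    folded, folds = set(), []
    for top, bottom in permutations(sorted(columns), 2):
        if top in folded or bottom in folded:
            continue
        if columns[top] & columns[bottom]:
            continue                         # some row needs both columns
        # Tentatively add: rows of `top` -> fold node -> rows of `bottom`.
        mid = ("fold", top, bottom)
        trial = {u: set(vs) for u, vs in edges.items()}
        for r in columns[top]:
            trial.setdefault(r, set()).add(mid)
        trial[mid] = set(columns[bottom])
        if not has_cycle(trial, nodes | {mid}):
            edges, nodes = trial, nodes | {mid}   # the fold is legal: keep it
            folded |= {top, bottom}
            folds.append((top, bottom))
    return folds

example = {"a": {0, 1}, "b": {2, 3}, "c": {0, 2}, "d": {1, 3}}
print(greedy_fold(example))
```

In the example, folding `a` over `b` forces rows 0 and 1 above rows 2 and 3, after which neither ordering of `c` and `d` is consistent, so the greedy pass stops with a single fold. Routing each fold through one intermediate node keeps the graph linear in the number of row-column incidences.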


3.4.  Plane  Embeddings

Given  a  circuit  described  by  nodes,  e.g.,  logic  gates,  and  wires  connecting  them,  we  would  like 
to  lay  out  the  circuit  in  minimum  possible  area.  Paper  [12]  shows  how  to  lay  out  a  class  of 
such  circuits  in  area  proportional  to  the  number  of  nodes;  the  class  is  larger  than  that  handled 
by  previously  known  layout  algorithms. 


3.5.  VLSI-Oriented  Algorithms 

Paper  [13]  considers  algorithms  for  doing  a  number  of  important  operations  with  a  special-
purpose  chip;  these  operations  are  graph-theoretic,  such  as  finding  connected  components  (that
problem  is  equivalent  to  "circuit  extraction"  from  a  CIF  layout).  There  are  a  number  of
powerful  algorithmic  ideas,  such  as  "funnel  pipelining,"  where  the  graph  is  progressively
transformed  by  combining  two  nodes  into  one,  so  after  log  n  stages,  an  n-node  graph  becomes 
trivial.  At  successive  stages,  the  time  to  perform  the  transformation  doubles,  so  the  total  work 
at  each  stage  is  the  same,  and  the  stages  can  be  implemented  by  a  pipeline  on  the  chip,  where 
all  stages  operate  in  parallel. 
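
The work balance behind funnel pipelining can be checked with a small calculation. The sketch below is only an accounting exercise under stated assumptions, not the algorithm of [13]: it assumes n is a power of two, that each stage exactly halves the node count, and that the cost of one transformation is 1 time unit at the first stage and doubles thereafter. The function name `funnel_stages` is hypothetical.

```python
def funnel_stages(n):
    """Return (stage, nodes entering, time per transformation, total work) tuples."""
    stages = []
    nodes, unit_time, k = n, 1, 0
    while nodes > 1:
        stages.append((k, nodes, unit_time, nodes * unit_time))
        nodes //= 2       # pairs of nodes are combined into one
        unit_time *= 2    # assumed: each transformation costs twice as much as before
        k += 1
    return stages

for stage in funnel_stages(16):
    print(stage)
```

For a 16-node graph this gives four stages (log2 of 16), each with the same total work of 16 units, which is why the stages can be balanced against one another in a pipeline.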